Fairness in Reinforcement Learning

Authors

  • Shahin Jabbari
  • Matthew Joseph
  • Michael Kearns
  • Jamie Morgenstern
  • Aaron Roth
Abstract

We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve non-trivial approximation to the optimal policy. We then provide a provably fair polynomial time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness.
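The fairness constraint stated in the abstract can be illustrated with a minimal sketch. It assumes the long-term discounted value of each action in a given state is already known (a learning algorithm would not have these values up front), and the names `is_fair`, `action_probs`, and `q_values` are hypothetical, chosen only for this illustration. The check reads the abstract literally: a choice distribution is flagged as unfair if it gives one action a strictly higher probability than another action whose long-term value is strictly higher.

```python
from typing import Sequence


def is_fair(action_probs: Sequence[float],
            q_values: Sequence[float],
            tol: float = 1e-12) -> bool:
    """Return True if no action is preferred over a strictly better one.

    action_probs[i]: probability of choosing action i in some fixed state.
    q_values[i]: assumed long-term discounted reward of action i
                 (hypothetical inputs; not available to a real learner).
    """
    n = len(action_probs)
    for a in range(n):
        for b in range(n):
            preferred = action_probs[a] > action_probs[b] + tol
            worse = q_values[b] > q_values[a] + tol
            # Violation: a is favored although b has higher long-term value.
            if preferred and worse:
                return False
    return True


# Illustrative values for one state with three actions (assumed, not from the paper).
q = [1.0, 2.0, 2.0]
uniform = [1 / 3, 1 / 3, 1 / 3]   # never prefers any action
greedy = [0.0, 0.5, 0.5]          # prefers only the best actions
unfair = [0.6, 0.2, 0.2]          # favors the worst action

print(is_fair(uniform, q), is_fair(greedy, q), is_fair(unfair, q))
```

In this sketch both the uniform and the greedy distributions pass the check, which matches the abstract's remark that fairness is consistent with the optimal policy; the difficulty described in the negative result concerns learning without knowing these values in advance.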

Similar resources

Reinforcement Learning for Fair Dynamic Pricing

Unfair pricing policies have been shown to be one of the most negative perceptions customers can have concerning pricing, and may result in long-term losses for a company. Despite the fact that dynamic pricing models help companies maximize revenue, fairness and equality should be taken into account in order to avoid unfair price differences between groups of customers. This paper shows how to ...

Fair Learning in Markovian Environments

We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with ...

Dynamic Resource Allocation through Reinforcement Learning Approach in Multi-cell OFDMA Networks

In this paper, we present a distributed resource allocation algorithm for cellular OFDMA networks by adopting a Reinforcement Learning (RL) approach. We use an RL method which employs Growing Self Organizing Maps to deal with the huge and continuous problem space. The goal of the algorithm is to maximize the network throughput in a fair manner. Indeed, the algorithm maximizes the throughput unti...

Cycle Time Optimization of Processes Using an Entropy-Based Learning for Task Allocation

Cycle time optimization could be one of the great challenges in business process management. Although there is much research on this subject, task similarities have been paid little attention. In this paper, a new approach is proposed to optimize cycle time by minimizing entropy of work lists in resource allocation while keeping workloads balanced. The idea of the entropy of work lists comes fr...

Do Wealth Differences Affect Fairness Considerations?

The influence of relative wealth on fairness considerations is analyzed in a series of ultimatum game experiments in which proposers and receivers are given large and widely unequal initial endowments. Subjects initially demonstrate a concern for fairness. With time however, the dynamics of behavior become at odds with both subgame perfection and fairness. Evidence of learning is detected for b...

Publication date: 2017